EXPRESSION MODULATING SEQUENCES-III
FIELD OF THE INVENTION 

The present invention relates generally to novel nucleic acid molecules capable of increasing
expression of nucleotide sequences in eukaryotic cells. The novel nucleic acid molecules of the
present invention may be used to increase and/or stabilise or otherwise facilitate expression of
nucleotide sequences resulting in the presence of a translation product or may be used to
down-regulate expression by, for example, promoting transcript degradation via mechanisms such as
co-suppression. The nucleotide sequence of the present invention is referred to herein as an
"expression modulating sequence" and generally results in the acquisition of a phenotypic trait
or loss of a phenotypic trait. The expression modulating sequence of the present invention is
useful inter alia to increase and/or stabilise or otherwise facilitate expression of nucleotide
sequences in eukaryotic cells and in particular the expression of therapeutically, agriculturally and
economically important transgenes. The expression modulating sequence of the present
invention may also be used to inhibit, reduce or otherwise down-regulate expression of a
nucleotide sequence such as a eukaryotic gene including a pathogen gene, the expression of
which results in an undesired phenotype.

BACKGROUND OF THE INVENTION 

Recombinant DNA technology is now an integral part of strategies to generate genetically 
modified eukaryotic cells. For example, genetic engineering has been used to develop varieties 
of plants with commercially useful traits and to produce mammalian cells which express a
therapeutically useful gene or to suppress expression of an unwanted gene. Transposons have 
played an important part in the genetic engineering of plant cells and some non-plant cells to 
provide inter alia tagged regions of genomes to facilitate the isolation of genes by recombinant 
DNA techniques. 

The maize transposon Activator (Ac) and its derivative Dissociation (Ds) comprise one of the
first transposon systems to be discovered (1, 2) and were first used to clone genes by Fedoroff et
al (3). The behaviour of Ac in maize has been studied extensively and excision occurs in both
somatic and germline tissue. Studies have highlighted two important features of Ac/Ds for
tagging: first, the transposition frequency and, second, the preference of Ac/Ds for transposition
to linked sites.

The use of the Ac/Ds system has been hampered by the difficulty of data interpretation due, for
example, to the high activity of Ac in certain plants and insertions at unlinked sites arising from
multiple transpositions rather than by a single event from the T-DNA. This problem was
addressed by Jones et al (4), Carroll et al (5) and others, who developed a two component Ac/Ds
system. In this system, the Ds elements were made by replacing the Ac transposase gene
with a marker gene, thereby rendering them non-autonomous. T-DNA regions of binary vectors
were constructed by Carroll et al (5) and Scofield et al (6) carrying either a Ds element or a
stabilised Activator transposase gene (sAc). The Ds element contained a reporter gene (e.g.
nos:BAR) which was shown to be inactivated on crossing with plants carrying the sAc (5). This
is referred to as transgene silencing. It has been shown that transgene silencing is a more general
phenomenon in transgenic plants (7, 8, 9). Many different types of transgene silencing have now
been reported in the literature and include: co-suppression of a transgene and a homologous
endogenous plant gene (10), inactivation of ectopically located homologous transgenes in
transgenic plants (7), the silencing of transgenes leading to resistance to virus infection (11) and
inactivation of transgenes inserted in maize transposons in transgenic tomato (5).

Gene silencing undoubtedly reflects mechanisms of great importance in the understanding of 
plant gene regulation. It is of particular importance because it represents a severe obstacle to
stable and high level expression of economically important transgenes (7). 

In work leading up to the present invention, the inventors sought to identify nucleotide sequences
which might prevent or otherwise reduce gene silencing and to facilitate increased and/or
stabilized gene expression in eukaryotic cells such as plant cells. In accordance with the present
invention, the subject inventors have now identified and isolated novel nucleotide sequences
referred to herein as "expression modulating sequences" or "EMSs" which are useful in
increasing or stabilizing nucleotide sequence expression in eukaryotic cells such as plant cells.
Such increased and stabilised nucleotide sequence expression can also lead to the promotion or
induction of transcript degradation via mechanisms such as co-suppression. Accordingly, the
EMSs of the present invention may also be used to inhibit, reduce or otherwise down-regulate
expression of target nucleotide sequences.

SUMMARY OF THE INVENTION 

Throughout this specification, unless the context requires otherwise, the word "comprise", or
variations such as "comprises" or "comprising", will be understood to imply the inclusion of a 
stated element or integer or group of elements or integers but not the exclusion of any other 
element or integer or group of elements or integers. 

Sequence Identity Numbers (SEQ ID NOs.) for the nucleotide sequences referred to in the
specification are defined following the bibliography. A summary of the SEQ ID NOs is given 
in Table 1. 

One aspect of the present invention provides an isolated nucleic acid molecule comprising a 
sequence of nucleotides which modulates expression of a second nucleotide sequence inserted
proximal to said first mentioned nucleotide sequence. 

More particularly, the present invention is directed to an isolated nucleic acid molecule 
comprising a sequence of nucleotides which increases or enhances expression of a second 
nucleotide sequence inserted within said first mentioned nucleotide sequence.

Another aspect of the present invention relates to an expression modulating sequence (EMS) 
comprising a sequence of nucleotides which increases or enhances expression of a nucleotide 
sequence inserted adjacent to, within or otherwise proximal to said EMS. 

Still another aspect of the present invention contemplates a genetic construct comprising an EMS
as herein defined and means to facilitate insertion of a nucleotide sequence within, adjacent to 
or otherwise proximal to said EMS. 

Still yet another aspect of the present invention provides a genetic construct comprising an EMS
as herein defined and means to facilitate insertion of a nucleotide sequence within, adjacent to
or otherwise proximal to said EMS and operably linked to a promoter.

Another aspect of the present invention contemplates a method of increasing or stabilizing
expression of a nucleotide sequence or otherwise preventing or reducing silencing of a nucleotide
sequence in a eukaryotic cell, said method comprising introducing into said eukaryotic cell the
nucleotide sequence flanked by, adjacent to or otherwise proximal to an EMS.

More particularly, the present invention provides a method of increasing or stabilizing expression
of a nucleotide sequence or otherwise preventing or reducing silencing of a nucleotide sequence
in a plant or cells of a plant, said method comprising introducing into said plant or plant cells the
nucleotide sequence flanked by, adjacent to or otherwise proximal to an EMS.

In an alternative embodiment, the present invention provides a method of inhibiting, reducing or
otherwise down-regulating expression of a nucleotide sequence in a eukaryotic cell, said method
comprising introducing into said eukaryotic cell the nucleotide sequence flanked by, adjacent to
or otherwise proximal to an EMS.

More particularly, the present invention is directed to a method of inhibiting, reducing or
otherwise down-regulating expression of a nucleotide sequence in a plant or cells of a plant, said
method comprising introducing into said plant or plant cells the nucleotide sequence flanked by,
adjacent to or otherwise proximal to an EMS.

Yet another aspect of the present invention provides a transgenic animal or plant carrying a
nucleotide sequence flanked by, adjacent to or otherwise proximal to an EMS.

Still a further aspect of the present invention provides an improved transposon tagging system,
said system comprising a transposable element carrying a nucleotide sequence flanked by,
adjacent to or otherwise proximal to an EMS.



TABLE 1
SUMMARY OF SEQ ID NOs.

SEQ ID NO.   DESCRIPTION

1            Nucleotide sequence of tomato α-amylase gene promoter
2            Nucleotide sequence of α-amylase gene promoter
3            Nucleotide sequence of genomic DNA upstream of Dem
             gene followed by Dem cDNA coding sequence
4            Nucleotide sequence upstream of Ds insertion (i.e.
             upstream of the nos:BAR gene) in a putative patatin gene
             in tomato
5            Nucleotide sequence downstream of Ds insertion (i.e.
             downstream of the nos:BAR gene) in a putative patatin
             gene in tomato
6            Nucleotide sequence of portion of putative tomato
             homologue of potato patatin gene
7            Nucleotide sequence of portion of potato patatin gene

BRIEF DESCRIPTION OF THE FIGURES

Figure 1 is a diagrammatic representation showing T-DNA regions of binary vectors carrying
a Ds element (SLJ1561) or the transposase gene (SLJ10512) (5). The Ds element carries a
nos:BAR gene and is inserted into a nos:SPEC excision marker. The transposase gene sAc is
linked to a T.Gus reporter gene.

Figure 2 is a diagrammatic representation showing an experimental strategy for generating
tomato lines carrying transposed Ds elements (5). F1 plants heterozygous for both the Ds and
sAc T-DNAs are test-crossed to produce TC1 progeny. The TC1 progeny are then screened for
lines carrying a transposed Ds and a reactivated nos:BAR gene.

Figure 3 is a photographic representation showing expression and silencing of the nos:BAR gene
in various tomato lines. Seedlings were germinated in the presence of phosphinothricin for
several weeks and then photographed. A. 1561E, B. UQ406, C. Non-transformed (i.e. does not
carry the nos:BAR gene), D-F. Three tomato lines that carry silent nos:BAR genes.

Figure 4 is a representation showing methylation of a genetically engineered Ds transposon in
transgenic tomato. Two separate Southern analyses were conducted on 7 individual genotypes;
genomic DNA was extracted from leaf tissue (5). The restriction enzymes and probes (shaded
boxes) used are shown on the figure. Lanes: 1. Non-transformed (i.e. no Ds or nos:BAR gene),
2. 1561E, which carries an active nos:BAR gene (because it has never been exposed
to the transposase gene), 3-6. Four tomato lines that carry silent nos:BAR genes, 7. UQ406,
which carries an active nos:BAR gene due to insertion of the Ds in the α-amylase promoter. The
enzymes SstII (abbreviated Ss) and NotI (abbreviated Nt) are methylation sensitive, whereas
BstYI (abbreviated Bs) and EcoRI (abbreviated RI) are not. The expected size fragment for
unmethylated DNA is indicated by the arrow; larger fragments (as in the silent lines) indicate
methylation of the DNA at the SstII or NotI sites.



Figure 5 is a representation showing a sequence comparison between the potato α-amylase
promoter (15) [SEQ ID NO:2] and the tomato α-amylase promoter [SEQ ID NO:1]. The
location of the UQ406 insertion is shown.

Figure 6 is a representation of a nucleotide sequence [SEQ ID NO:3] of genomic DNA from
651 bp upstream of the Ds insertion in UQ406 to the beginning of the Dem coding sequence,
followed by the Dem cDNA sequence from the ATG start site at base pair 4097. The target
sequences of the Ds insertion in UQ406 and Dem ATG are underlined. The Dem cDNA
sequence is shown in italics and underlined.

Figure 7 is a photographic representation showing a stable mutant and a somatic revertant of 
the Dem locus. The seedling at the right in the background is homozygous for the Ds insertion 
in the Dem gene. The stable mutant fails to develop beyond the stage shown in the figure. The 
somatic revertant in the foreground is homozygous for the Ds insertion at the zygotic stage of 
development, but it also inherited a transposase gene which causes Ds excision and reversion of
the phenotype to wild-type. Somatic revertants are characterized by abnormal cotyledons but 
develop a functional shoot meristem due to Ds excision and restoration of Dem function. Each 
somatic revertant represents an independent transposition event. 

Figure 8 is a diagrammatic representation showing an improved transposon tagging strategy
using Dem as an excision marker. The sAc and Ds parent lines are represented by the upper left and
right boxes, respectively. Because the sAc is linked to the dem mutant +7 allele, somatic
revertants can theoretically occur at about the frequency of 1 out of 4 in the F1 progeny. Each
somatic revertant represents an independent transposition event. Chr4, chromosome 4 of
tomato.

Figure 9 is a diagrammatic representation showing plant expression vector pZorz carrying
Osa.Luc (12).

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

The present invention is predicated in part on the elucidation of the molecular basis of
transposase-mediated silencing of genetic material located within a transposable element.
Although, in accordance with the present invention, the molecular basis of gene silencing has
been determined with respect to plant selectable marker genes within the Ds element of the
Ds/Ac maize transposon system, the present invention clearly extends to the silencing of any
nucleotide sequence and in particular a transgene and to mechanisms for alleviating gene
silencing. In accordance with the present invention, nucleotide sequences have been identified
which alleviate gene silencing and which increase or stabilise expression of genetic material.
Furthermore, although the present invention is particularly exemplified in relation to plants, it
extends to all eukaryotic cells such as cells from mammals, insects, yeasts, reptiles and birds.

Accordingly, one aspect of the present invention provides an isolated nucleic acid molecule 
comprising a sequence of nucleotides which increases or stabilizes expression of a second
nucleotide sequence inserted proximal to said first mentioned nucleotide sequence. 

The term "proximal" is used in its most general sense to include the position of the second
nucleotide sequence near, close to or in the genetic vicinity of the first mentioned nucleotide
sequence. More particularly, the term "proximal" is taken herein to mean that the second
nucleotide sequence precedes, follows or is flanked by the first mentioned nucleotide sequence.
Preferably, the second nucleotide sequence is within the first mentioned nucleotide sequence and,
hence, is flanked by portions of the first nucleotide sequence. Generally, the second nucleotide
sequence is flanked by up to about 10 kb either side of the first mentioned nucleotide sequence, more
preferably up to about 5 kb, even more preferably up to about 4 kb either side of said first
mentioned nucleotide sequence and even more preferably from about 10 bp to about 1 kb.

Accordingly, another aspect of the present invention is directed to an isolated nucleic acid
molecule comprising a sequence of nucleotides which stabilises, increases or enhances expression
of a second nucleotide sequence inserted into, flanked by, adjacent to or otherwise proximal to
the said first mentioned nucleotide sequence.

The term "expression" is conveniently determined in terms of desired phenotype. Accordingly,
the expression of a nucleotide sequence may be determined by a measurable phenotypic change
involving transcription and translation into a proteinaceous product which in turn has a
phenotypic effect or at least contributes to a phenotypic effect. Alternatively, expression may
involve induction or promotion of transcript degradation such as during co-suppression resulting
in inhibition, reduction or otherwise down-regulation of translatable product of a gene. In the
latter case, the nucleic acid molecules of the present invention may result in production of
sufficient transcript to induce or promote transcript degradation. This is particularly useful if a
target endogenous gene is to be silenced or if the target sequence is from a pathogen such as a
virus, bacterium, fungus or protozoan. In all instances "expression" is modulated but the result
is conveniently measured as a phenotypic change resulting from increased or stabilised
production of transcript, resulting in increased or stabilised translation product or increased or
enhanced transcript production leading to transcript degradation such as in co-suppression
resulting in loss of translation product.

The second mentioned nucleotide sequence is preferably an exogenous nucleotide sequence 
meaning that it is either not normally indigenous to the genome of the recipient cell or has been 
isolated from a cell's genome and then re-introduced into cells of the same plant or animal, same 
species of plant or animal or a different plant or animal. More preferably, the exogenous 
sequence is a transgene or a derivative thereof which includes parts, portions, fragments and
homologues of the gene. 

The first mentioned nucleotide sequence described above is referred to herein as an "expression 
modulating sequence" (EMS) since it functions to and is capable of increasing or stabilizing 
expression of an exogenous nucleotide sequence such as a transgene or its derivatives. This in
turn may have the effect of alleviating silencing of an exogenous nucleotide sequence or may 
promote transcript degradation such as via co-suppression. The latter is particularly useful as 
a defence mechanism against pathogens such as but not limited to plant viruses and animal 
pathogens. 

Accordingly, another aspect of the present invention relates to an expression modulating
sequence (EMS) comprising a sequence of nucleotides which increases, enhances or stabilizes 
expression of a second nucleotide sequence inserted within, adjacent to or otherwise proximal 
to said EMS. 

The term "modulating" is used to emphasise that although transcription may be increased or
stabilised, this may have the effect of either permitting stabilised or enhanced translation of a
product or inducing transcript degradation such as via co-suppression.

The EMSs of the present invention were identified, in accordance with the present invention,
following transposon mutagenesis of plants with the Ds/Ac transposon system. The Ds element
carries a reporter gene (nos:BAR) which is normally silenced upon exposure to the transposase
gene. In a few cases, plants are detected in which nos:BAR expression is not silenced. In
accordance with the present invention, it has been determined that, in such plants, the Ds element
has inserted within, adjacent to or otherwise proximal to an EMS which results in increased or
stabilized expression of the nos:BAR gene. In other words, the EMS facilitates expression of a gene and
preferably an exogenous gene or a transgene. This in turn may result in gene product being
produced or induction of transcript degradation such as via co-suppression.

The EMSs of the present invention are conveniently provided in a genetic construct.

Accordingly, another aspect of the present invention contemplates a genetic construct comprising 
an EMS as herein defined and means to facilitate insertion of a nucleotide sequence within, 
adjacent to or otherwise proximal to said EMS.

The term "genetic construct" is used in its broadest sense to include any recombinant nucleic acid 
molecule and includes a vector, binary vector, recombinant virus and gene construct. 

The means to facilitate insertion of a nucleotide sequence include but are not limited to one or
more restriction endonuclease sites, homologous recombination, transposon insertion, random
insertion and primer and site-directed insertion mutagenesis. Preferably, however, the means is
one or more restriction endonuclease sites. In the case of the latter, the nucleic acid molecule
is cleaved and another nucleotide sequence ligated into the cleaved nucleic acid molecule.

Preferably, the inserted nucleotide sequence is operably linked to a promoter in the genetic
construct.

According to this embodiment, there is provided a genetic construct comprising an EMS as 
herein defined and means to facilitate insertion of a nucleotide sequence within, adjacent to or 
otherwise proximal to said EMS and operably linked to a promoter.

Conveniently, the genetic construct may be a transposable element such as but not limited to a 
modified form of Ds. A modified form of Ds includes a Ds molecule comprising an EMS and 
a nucleotide sequence such as but not limited to a reporter gene, a gene conferring a particular 
trait on a plant cell or a plant regenerated from said cell or a gene which will promote
co-suppression of an endogenous gene.

Another aspect of the present invention contemplates a method of increasing or stabilising
expression of a nucleotide sequence or otherwise preventing or reducing silencing of a nucleotide
sequence or promoting transcript degradation of an endogenous gene in a plant or animal or
cells of a plant or animal, said method comprising introducing into said plant or animal or plant
or animal cells said nucleotide sequence flanked by, adjacent to or otherwise proximal to an
EMS.

In an alternative embodiment, there is provided a method of inhibiting, reducing or otherwise
down-regulating expression of a nucleotide sequence in a plant or animal or cells of a plant or
animal, said method comprising introducing into said plant or animal or plant or animal cells the
nucleotide sequence flanked by, adjacent to or otherwise proximal to an EMS.

Yet another aspect of the present invention provides a transgenic plant or animal carrying a
nucleotide sequence flanked by, adjacent to or otherwise proximal to an EMS. As a
consequence of the EMS, the expression of the exogenous nucleotide sequence is increased or
stabilised resulting in expression of a phenotype or loss of a phenotype.

Although not intending to limit the present invention to any one theory or mode of action, the 
EMS is proposed to comprise a methylation resistance sequence. A methylation resistance 
sequence is one which may de-methylate and/or prevent or reduce methylation of a nucleotide
sequence such as a target nucleotide sequence. 

According to this aspect, the DNA methylation resistant sequence may prevent inhibition of
transcription or delay mRNA transcript turnover. This can enhance, increase or stabilise a
transcript and its translation into a gene product or may induce or promote transcript degradation
such as via co-suppression.

The present invention further provides for an improved transposon tagging system. 

One system employs a modified Ds element which now carries an EMS.

Accordingly, another aspect of the present invention is directed to an improved transposon 
tagging system, said system comprising a transposable element carrying a nucleotide sequence 
flanked by, adjacent to or otherwise proximal to an EMS.

Another new system employs the Dem gene or its derivatives as an excision marker. Reference
to "derivatives" includes reference to mutants, parts, fragments and homologues of Dem including
functional equivalents. The Dem gene is required for cotyledon development and shoot and root
meristem function. Stable Ds insertion mutants of Dem germinate but fail to develop any further.
However, in unstable mutants of the Dem locus, excision of the Ds element results in reversion
of the Dem locus to wild-type, thereby restoring function to the shoot meristem. In accordance
with the present invention, the new system enables selection for transposition.

In accordance with the improved method, transposition is initiated by crossing a Ds line with a
stabilized Ac (sAc) line. The Ds line is heterozygous for a Ds insertion in the Dem gene and the
sAc line is heterozygous for a stable mutation in the Dem gene. A particularly useful mutant in
the Dem gene is a frameshift mutation. Both the Ds and sAc containing plant lines are
wild-type due to the recessive nature of the Ds insertion and mutant alleles. The F1 progeny derived
from crossing the Ds and sAc lines segregate at a ratio of 3 wild-types to 1 mutant. Because the
sAc is linked to the frameshift dem allele, almost all of the mutants also inherit the transposase
gene and can undergo somatic reversion. These revertant individuals have abnormal
cotyledons, but Ds excision from the Dem gene restores function to the shoot apical meristem.
Each somatic revertant represents an independent transposition event from the Dem locus. By
screening for expression of a gene resident on the Ds element (e.g. nos:BAR), EMSs are
readily identified.
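
The 3:1 segregation described above can be checked with a small enumeration of the cross. The following Python sketch is illustrative only: the allele labels "Dem", "dem-Ds" and "dem-fs" are shorthand introduced here, not nomenclature from the specification, and the model assumes simple Mendelian inheritance at a single locus with both mutant alleles fully recessive.

```python
from collections import Counter
from itertools import product

# Illustrative allele labels (assumptions, not taken from the specification):
#   "Dem"    wild-type allele
#   "dem-Ds" allele disrupted by the Ds insertion (carried by the Ds parent)
#   "dem-fs" stable frameshift allele (carried by the sAc parent)
DS_PARENT = ("Dem", "dem-Ds")
SAC_PARENT = ("Dem", "dem-fs")


def phenotype(genotype):
    """A seedling is wild-type if it carries at least one functional Dem allele."""
    return "wild-type" if "Dem" in genotype else "mutant"


def cross(parent_a, parent_b):
    """Enumerate the four equally likely F1 genotypes of a single-locus cross."""
    return [tuple(sorted(pair)) for pair in product(parent_a, parent_b)]


def phenotype_counts(parent_a, parent_b):
    """Count F1 phenotypes across the four equally likely genotypes."""
    return Counter(phenotype(g) for g in cross(parent_a, parent_b))


counts = phenotype_counts(DS_PARENT, SAC_PARENT)
print(counts)
```

Under these assumptions the only mutant class is the dem-Ds/dem-fs genotype, which is the class that can show somatic reversion when the linked transposase gene is co-inherited.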

The present invention also provides in vivo bioassays for expressed transgenes. The bioassays 
identify nucleotide sequences which prevent transgene silencing. 

In one aspect, the plant expression vector pZorz (see Figure 9) carries a firefly luciferase reporter
gene (luc), under the control of the Osa promoter (12). After bombardment, the gene is
expressed in embryogenic sugarcane callus. However, it becomes completely silenced upon plant
regeneration. The silencing appears to be correlated with methylation of the transgene. Genetic
sequences flanking reactivated nos:BAR insertions are inserted in the pZorz vector at the HindIII
site upstream from the Osa promoter. These modified pZorz constructs are then used with a
transformation marker to transform sugarcane in order to test whether the plant sequences are
capable of alleviating silencing of the luc gene upon plant regeneration. Restriction endonuclease
fragments capable of alleviating silencing of the luc gene are subcloned by deletion analysis into
smaller fragments to define the sequence more accurately.

In another aspect, a plant expression vector is constructed for testing the EMSs in
Agrobacterium-transformed Arabidopsis. EMSs are placed upstream of the nos:luc or nos:gus
gene linked to a transformation marker and used to test whether EMSs stabilise expression of
the nos:luc or nos:gus gene in Arabidopsis.



These aspects of the present invention are clearly extendable to assays using other plants and the
present invention contemplates the subject assay and plant expression vector for use in a range
of plants in addition to sugarcane.

The present invention is further described by the following non-limiting Examples.

EXAMPLE 1 
Ds/sAc Transposon system 

The inventors have previously developed a two component Ds/sAc transposon system in
transgenic tomato for tagging and cloning important genes from plants (5, 13). The
components of the system are shown in Figure 1 and comprise: i) a non-autonomous
genetically-engineered Ds element (e.g. SLJ1561), and ii) an unlinked transposase gene sAc
(SLJ10512), required for transposition of the Ds element. To activate transposition, the two
components are combined by crossing transformants for each component. A plant selectable
marker gene, e.g. nos:BAR, is inserted into the Ds element to enable selection for reinsertion
of the elements following excision from the T-DNA (Figure 1). Surprisingly, the marker gene
is irreversibly inactivated when the Ds line is crossed to a transformant expressing the
transposase gene (5). Silencing occurred when the Ds element remained in the T-DNA, and
also occurred in the great majority of cases when the Ds element transposed to a new location
in the tomato genome. None of the other marker genes in the T-DNA is silenced. The silenced
marker gene has been shown to be stably inherited, even after the transposase gene segregates
away from the Ds element in subsequent generations.



EXAMPLE 2

Transposon tagging of a chromosomal region enabling 
full expression of the nos:BAR transgene

The experimental strategy for generating tomato lines carrying transposed Ds elements from
T-DNA 1561E is shown in Figure 2. The Ds element in 1561E carries a nos:BAR marker gene.
In construction of the Ds, the 5' end of the nos promoter is cloned into the XhoI site, 1100 bp
from the 3' end of Ac. As a strategy to tag regions of the tomato genome associated with high
level gene expression, hundreds of plants carrying transposed Ds elements are screened for
resistance to phosphinothricin (PPT), the selection agent for the BAR gene. Several lines are
identified which show at least some level of resistance. One line, called UQ406, carries a single
transposed Ds element (without the transposase gene which has segregated away) and is
resistant to PPT (Figure 3). Stable inheritance of BAR gene expression in this line has been
demonstrated through several generations. These results indicate that the strategy for tagging
active chromosomal regions by screening for PPT resistance is a successful approach. Southern
hybridization analysis of the original Ds transformant 1561E, UQ406 and several lines carrying
silenced nos:BAR transgenes indicates that silencing is correlated with methylation of the SstII
site in the nos promoter (Figure 4). Total leaf tissue is used in this analysis, and the SstII site
in the nos promoter in UQ406 is partially methylated. In silent nos:BAR genes, a NotI site
immediately downstream from the coding sequence is also methylated (Figure 4). In UQ406,
the NotI site is unmethylated, as in 1561E (Figure 4).
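
The reasoning behind the Southern analysis, namely that a methylated site is not cut by a methylation-sensitive enzyme and therefore yields a larger fragment, can be illustrated with a minimal Python sketch. The molecule length and site coordinates below are hypothetical values chosen only for illustration; they are not taken from the specification or from Figure 4.

```python
def digest(length, sites, methylated=frozenset()):
    """Return fragment lengths from cutting a linear molecule at unmethylated sites.

    length     -- length of the linear DNA molecule in bp
    sites      -- cut positions (bp from the left end) for the enzymes used
    methylated -- positions blocked by methylation (relevant only to
                  methylation-sensitive enzymes)
    """
    cuts = sorted(p for p in sites if p not in methylated)
    bounds = [0] + cuts + [length]
    return [b - a for a, b in zip(bounds, bounds[1:])]


# Hypothetical coordinates (assumptions for illustration only).
LENGTH = 6000
INSENSITIVE_SITES = [1000, 5000]  # sites for a methylation-insensitive enzyme
SENSITIVE_SITE = 2500             # site for a methylation-sensitive enzyme

# Active line: the sensitive site is unmethylated and is cut.
active = digest(LENGTH, INSENSITIVE_SITES + [SENSITIVE_SITE])

# Silent line: the sensitive site is methylated and is not cut.
silent = digest(LENGTH, INSENSITIVE_SITES + [SENSITIVE_SITE],
                methylated={SENSITIVE_SITE})

print(active)
print(silent)
```

In this toy model the two fragments that flank the sensitive site in the active line merge into a single larger fragment in the silent line, which is the kind of size shift that a probe spanning the site would detect.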

EXAMPLE 3 
Cloning sequences flanking active nos:BAR genes 

GenomeWalker (14) is used to clone the tomato DNA sequences flanking the Ds element in
line UQ406. The flanking DNA is sequenced, and a
search of the PROSITE database reveals that the Ds has inserted into the promoter region of
an α-amylase gene. The promoter [SEQ ID NO:1] shows strong homology to an α-amylase
promoter of potato (15; Figure 5) [SEQ ID NO:2] and the coding sequence of the gene has
strong homology with one of 3 reported potato α-amylase cDNAs (16). The DNA from 651 bp
upstream of the UQ406 insertion to the end of the Dem coding sequence has been sequenced
(Figure 6) [SEQ ID NO:3].

EXAMPLE 4 

An improved transposon tagging strategy for transgenic tomato

The inventors have used the transposon tagging system described in Example 1 (also see Figure
2) to tag and clone two important genes involved in shoot morphogenesis. The DCL gene is
required for chloroplast development and palisade cell morphogenesis (13) and the Dem
(Defective Embryo Meristem) gene is required for cotyledon development and shoot and root
meristem function. Stable Ds insertion mutants of Dem germinate but fail to develop any further
(Figure 7). Figure 7 also shows an example of an unstable mutant of the Dem locus. Upon
germination, these variegated seedlings appear at first to be mutant. However, the transposase
gene activates transposition of the Ds and reversion of the Dem locus to wild-type, thereby
restoring function to the shoot meristem.

While the transposon tagging system described in Figure 2 has been successful in tagging genes
and a chromosomal region alleviating transgene silencing, it does have two associated
inefficiencies. First, transposition cannot be selected in the shoot meristem of F1 plants
heterozygous for Ds and sAc. As a consequence, many TC1 progeny derived from test-crossing
these F1 plants still have the Ds located in the T-DNA. The other limitation of the system is that
sibling TC1 progeny derived from a single F1 plant often carry the same clonal transposition and
reinsertion event. The extent of clonal events amongst sibling TC1 progeny can only be
monitored by time-consuming and expensive Southern hybridisation analysis.

These two inefficiencies in the transposon tagging strategy are overcome in accordance with the
present invention by using the Dem gene as an excision marker. The new system enables
selection for transposition in the shoot apical meristem and visual identification of plants carrying
independent transposition events. Transposition is initiated by crossing a Ds line with a sAc line
(Figure 8). The Ds line is heterozygous for a Ds insertion in the Dem gene and the sAc line is
heterozygous for a stable frameshift mutation in the Dem gene (Figure 8). The frameshift allele
is derived from a Ds excision event from the Dem locus. Both the Ds and sAc lines are wild-type
due to the recessive nature of the Ds insertion and frameshift alleles. PCR tests on intact leaf
tissue have been developed for the rapid identification of these Ds and sAc parental lines. The
F1 progeny derived from crossing the Ds and sAc lines segregate at the expected ratio of 3
wild-types to 1 mutant. Because the sAc is linked to the frameshift dem allele, almost all of the F1
mutants also inherit the transposase gene (sAc) and can undergo somatic reversion. These
revertant individuals have abnormal cotyledons, but Ds excision from the Dem gene restores
function to the shoot apical meristem (see Figure 7). Each somatic revertant represents an
independent transposition event from the Dem locus. A non-destructive test for nos:BAR
expression is used, involving application of PPT (the selective agent for expression of the BAR
gene) to a small area of a leaf. Somatic revertants resistant to PPT are grown through to seed and
the F2 progeny are screened again for PPT resistance. Lines carrying transposed Ds elements
expressing nos:BAR are selected for more detailed molecular analysis. Three independent
insertions (UQ11, UQ12 and UQ14) carry active nos:BAR genes. The donor Ds was originally
located in the Dem gene (Figure 4) and in that location the nos:BAR gene was silent.
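
The expected 3:1 segregation described above can be checked by enumerating the gametes of the two parental lines. The following Python sketch is illustrative only; the allele labels (Dem, dem-Ds, dem-fs) and the function names are assumptions introduced for the example, and both mutant alleles are treated as fully recessive, as stated above.

```python
from collections import Counter
from itertools import product

# Hypothetical allele labels: "Dem" is wild-type, "dem-Ds" carries the Ds insertion,
# "dem-fs" is the stable frameshift allele (linked to sAc in the sAc parent).
DS_PARENT = ("Dem", "dem-Ds")
SAC_PARENT = ("Dem", "dem-fs")


def phenotype(genotype):
    """Both dem alleles are recessive: one wild-type Dem allele gives a wild-type plant."""
    return "wild-type" if "Dem" in genotype else "mutant"


def f1_counts(parent_a, parent_b):
    """Count F1 phenotypes over all equally likely gamete combinations."""
    return Counter(phenotype(pair) for pair in product(parent_a, parent_b))


counts = f1_counts(DS_PARENT, SAC_PARENT)
print(counts["wild-type"], "wild-type :", counts["mutant"], "mutant")
```

The single mutant class is the dem-Ds/dem-fs combination, which is the class in which somatic reversion by Ds excision can be scored.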

The efficient saturation mutagenesis of this chromosomal region is dependent on the use of the 
Dem gene as a selectable marker for independent transposition events. A recombinant selectable 
marker for independent transpositions is produced and transformed into tomato for saturation
mutagenesis in other chromosomal regions of tomato. This system may be introduced into any 
species possessing the dem mutation, in order to facilitate transposon tagging of genes. 

EXAMPLE 5 

Ds transposon tagging of a putative patatin gene

DNA sequences flanking the active nos:BAR in a line designated UQ12 have similarly been
cloned and sequenced. The flanking DNA appears to correspond to an intron in a homologous
potato patatin gene. Patatin is the major protein in the potato tuber and has many potentially
important characteristics.

The sequence upstream of the Ds insertion (i.e. upstream of the nos:BAR gene) is as follows:

AATCAAAGAG GAATTNAATT CCNCAAAATT TCATCCATAG ATTTTGNGTC 50
TCTGAAAATT AAAGTGACTT TGTAATCTGA AACCTAGAGT CCTCAACCAT 100
ATCATTGACC ATTAAGCCAT ACCCTTAAAT GTAGGGAATT TGAAGTTTTA 150
AAAACCACAC TTTGTTATTT ATTGGCCCAA ATACTCGATA ATCTTTACAT 200
TATTGAAAAT CAACATTCAA AAGGAACGAA CCTTCAATCA CACCATCAAT 250
GTCAACTTTC TTTTATTTTG GATAATCTAA GTTTTTAAAT TGCAGTAAAA 300
TNAAATAAAA CCCTAAACTT CTTCTAGGTT GAGACTTAGT AAATATGAAT 350
TATATAAAGA ATTCATGACA AATGAGACAT AAGAATAGTG CCAGCAAATT 400
ACTTTTTTGA TATCTTATCT GTGATATCGG AATTTTAACT ACCATAAATT 450
TATGAATGAA ATATCACTTA TCTATTAGAG AGGATTTAAT CTCCCTTATA 500
ATGACATTGA TAAAAGCAAG NACAAGTGCT CTTTATTTCT TAATTACAAA 550
TCCTTAAATA GATAAAAGCT ACGAATAACA TAATATCCTT AAATAGATAA 600
AAGCTACGAA TAACATAATA GTATATTACT CCNAATTATT TTGATTTATT 650
TAAAATGACT CCACTAATCC TGATGTGGTC TAGG [SEQ ID NO:4] 684

The tomato sequence immediately downstream of the Ds insertion (i.e. downstream of the
nos:BAR gene) is as follows:

GGTCTAGGCC CTGGGTCTAG GAAACAAAAT AACTTATTTG ACTCCTAAAC 50
AATAGCAACA TACAAACCAC TGATATTGTA CAAGTAAAAT TCAATAAAAT 100
TCTAGCTCTC TCAAACACTT TTAAAATTGT TATTTCTGTT TTGTCTGTGT 150
CATATTATGA CCTACACAAC AACAACAACA ACGAATTTAG TGAAACTCTA 200
CAAAGTGGAG CCTGAAGTCG AGAGTTTACG CGGGCCTTAT CACTATCTTT 250
TCGAGATAAA AAAATTATTT TTAAAAGATC ATCGACTTAA ACAAACCAAA 300
CAATAATTAA AAAAATATGA ATTAATAGCA AAGCAGTGTG GACCATATAT 350
ACAAAAATCT ATAACAACAA CAAGGTGCAG AGCATTATTC CAACTAAGAT 400
CGAAGTTGTG ATACTGTCAT AATAAAAATG ACACATATTT TGACAACATA 450
AAAAATAAAT AACCATAAAA TATATCATAG AAAAATGAAT ATATTAGAAC 500
AGCTCACTCC AATATTAAAA GAGAGAAAAA AAATATTTTC CCACCACAAT 550
GCCATAATCC TTGAGCTTAG CTATTTATAA GTAAAAAAAA TGTTTTCTTG 600
GATAAATAGA AAAAGAAATA ATAATTAAAC ATAACCAATC ACTTCACAAA 650
TAAGAGTGTA TT [SEQ ID NO:5] 662

The level of homology between the potato and our tomato sequence is as follows: 

Tomato:  307 ATTTATTTTTAGGAAAAATTATCTAAATACACATCTTATTTTACCATATACTCTAAAAAT 248
             | |||| ||||||||||||||  |||||||||| |||| | ||  |||| |||||||| |
Potato: 1914 AATTATATTTAGGAAAAATTACATAAATACACAACTTAATATATTATATTCTCTAAAATT 1973

Tomato:  247 TCC 245 [SEQ ID NO:6]
             |||
Potato: 1974 TCC 1976 [SEQ ID NO:7]
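
The level of identity in the aligned segment above can be counted directly. The following Python sketch is illustrative only; it uses the 63-base tomato and potato segments as printed in SEQ ID NO:6 and SEQ ID NO:7 and assumes an ungapped, position-by-position comparison. The function name percent_identity is hypothetical.

```python
# Aligned 63-base segments as printed in SEQ ID NO:6 (tomato) and SEQ ID NO:7 (potato).
TOMATO = ("ATTTATTTTT" "AGGAAAAATT" "ATCTAAATAC"
          "ACATCTTATT" "TTACCATATA" "CTCTAAAAAT" "TCC")
POTATO = ("AATTATATTT" "AGGAAAAATT" "ACATAAATAC"
          "ACAACTTAAT" "ATATTATATT" "CTCTAAAATT" "TCC")


def percent_identity(a, b):
    """Return (identical positions, percent identity) for two equal-length ungapped strings."""
    if len(a) != len(b):
        raise ValueError("sequences must be the same length")
    identical = sum(x == y for x, y in zip(a, b))
    return identical, 100.0 * identical / len(a)


identical, pct = percent_identity(TOMATO, POTATO)
print(f"{identical}/{len(TOMATO)} identical positions ({pct:.1f}%)")
```

On the sequences as printed, this reports 52 identical positions out of 63 (82.5%).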

EXAMPLE 6
Tagging of additional genes

Selecting for transposition of a methylated Ds from the Dem locus and for expression of the
nos:BAR gene (i.e. demethylation of the Ds) efficiently identifies Ds insertions into genes, as
opposed to so-called "junk DNA". The sequences adjacent to five of these Ds insertions have
been cloned and sequenced, and in all cases the Ds insertion is in the vicinity of a known gene.

The five lines carrying active nos:BAR genes associated with genes are:
• Ds insertion in UQ406 - associated with the promoter of an α-amylase gene (Example
3, above);
• Ds insertion in UQ12 - associated with a putative patatin gene (Example 5, above);
• Ds insertion in UQ11 - associated with the Right Border of the Agrobacterium T-DNA
in 1516E (refer to Figure 2). This was the T-DNA carrying the Ds that was initially
transformed into tomato. In other words, the Ds transposed from the Dem locus back
into the T-DNA;
• Ds insertion in UQ14 - associated with or closely linked to a putative sucrose synthase
gene; and
• Ds insertion in UQ13 - associated with or closely linked to a putative
UDP-glucose-pyrophosphorylase gene.

In four of these instances, the Ds has inserted into or near sequences homologous to carbon
metabolism genes. These data indicate that many C metabolism genes and many so-called
housekeeping genes contain de-methylation sequences or sequences which prevent or reduce
methylation.

EXAMPLE 7

A rapid bioassay for identification of tomato DNA sequences
capable of alleviating transgene silencing in a heterologous plant species

An efficient transformation system has been developed for sugarcane, based on particle
bombardment of embryogenic callus, followed by plant regeneration (17). The bioassay is useful
for identifying tomato sequences which prevent transgene silencing and employs the plant
expression vector pZorz (Figure 9). This plasmid carries a firefly luciferase reporter gene (luc)
under the control of the Osa promoter (12). After bombardment of embryogenic callus of
sugarcane, the luciferase gene is expressed, as observed by visualisation of the chemiluminescence
of the luciferase enzyme. However, it becomes completely silenced upon plant regeneration in
normal sugarcane. This is used to test the system. The silencing appears to be correlated with
methylation of the transgene. Tomato sequences flanking reactivated nos:BAR insertions are
inserted in the pZorz vector at the HindIII site upstream from the Osa promoter (Figure 10).
These modified pZorz constructs are then used with a transformation marker to transform
sugarcane in order to test whether the tomato sequences are capable of alleviating silencing of
the luc gene. They are then subcloned by deletion analysis into smaller fragments to more
accurately define the sequences.

Tomato sequences flanking reactivated nos:BAR insertions are also introduced next to a
nos:BAR, nos:LUC or nos:GUS recombinant gene in another plasmid vector. These modified
recombinant BAR, LUC and GUS genes are inserted into binary vectors (4) for transformation
into Arabidopsis thaliana (18) to test the ability to prevent silencing of the nos:BAR gene in
Arabidopsis.

EXAMPLE 8

Analysis of sequences responsible for reactivating nos:BAR expression

The borders of DNA elements that prevent transgene silencing are initially defined by deletion
analysis of clones that yield positive results in the bioassays. The smallest active clone for each
chromosomal region is then sequenced and characterised in detail. Sequences from independent
Ds insertions are compared for homologous DNA elements.
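
The deletion analysis described above can be planned as a series of progressively shorter fragments. The following Python sketch is illustrative only: the step size, the minimum fragment length and the is_active callback (which stands in for the outcome of the bioassay) are hypothetical and are not specified in this Example, and deletions are taken from one end only for simplicity.

```python
def nested_deletions(sequence, step, min_length):
    """Yield (bases_removed, fragment) for progressive deletions from the 5' end."""
    if step <= 0:
        raise ValueError("step must be positive")
    removed = 0
    while len(sequence) - removed >= min_length:
        yield removed, sequence[removed:]
        removed += step


def smallest_active(sequence, step, min_length, is_active):
    """Return the shortest fragment in the series that still scores as active, or None."""
    best = None
    for _, fragment in nested_deletions(sequence, step, min_length):
        if is_active(fragment):
            best = fragment  # fragments get shorter, so the last active one is the smallest
    return best


# Hypothetical clone in which activity depends on an arbitrary placeholder motif.
clone = "A" * 20 + "GATTACA" + "C" * 10
result = smallest_active(clone, step=5, min_length=5,
                         is_active=lambda frag: "GATTACA" in frag)
print(len(result), result)
```

In practice the activity of each fragment would come from the sugarcane or Arabidopsis bioassay rather than from a motif search; the sketch only shows how the series of clones and the choice of the smallest active clone fit together.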

Those skilled in the art will appreciate that the invention described herein is susceptible to 
variations and modifications other than those specifically described. It is to be understood that
the invention includes all such variations and modifications. The invention also includes all of the 
steps, features, compositions and compounds referred to or indicated in this specification, 
individually or collectively, and any and all combinations of any two or more of said steps or 
features. 

(ii) TITLE OF INVENTION: EXPRESSION MODULATING SEQUENCES-III

(iii) NUMBER OF SEQUENCES: 7 

(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: DAVIES COLLISON CAVE

(B) STREET: 1 LITTLE COLLINS STREET 

(C) CITY: MELBOURNE 

(D) STATE: VICTORIA 

(E) COUNTRY: AUSTRALIA 

(F) ZIP: 3000 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: AUSTRALIAN PROVISIONAL 

(B) FILING DATE: 25-SEP-1998

(vii) PRIOR APPLICATION DATA 

(A) APPLICATION NO. PP3901 

(B) FILING DATE: 4-JUNE-1998 

(viii) ATTORNEY/AGENT INFORMATION: 
(A) NAME: HUGHES, DR E JOHN L 

(C) REFERENCE/DOCKET NUMBER: EJH/EK 

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: +61 3 9254 2777 

(B) TELEFAX: +61 3 9254 2770 

(C) TELEX: AA 31787

(2) INFORMATION FOR SEQ ID NO : 1 :

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1217 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 1 : 



TTTGAAATTT ATGTATTTAT CTATAGCATT AGAAACTATA AGAGTTGTTA GCTTCACTTG 60
GCTTACTGTT GTGCTCAAAG CAACTTCATC ATCATACAGT ATGGTTTTGA TATGCTCTTC 120
CATTATCACT GAGCCTTATG ATTATGTTTT ACGAGCTTAT AATATCACTG ATGGTGATTC 180
AGTATTGTGA TTATGTCCTT CGTTGATTAT TCTGTTTCAT ACAAGTCGTG TAATTTGCTG 240
TTTGTGACAG TACGATAGAT CGACTCAACC TTCTGAGGTA TTAGTTGAAG TTCATGTAAA 300
TTAGCTTTGT TTATCATAGT AGCATTTGAT TATTGATGCT CTGTAGCTAA TGATAAGCCA 360
TTGGAGGGAA GCAAGCTTTC TAAATGAATC TACGAATGGA TGATAAAGTT CATGAATATT 420
TTTGTTACTT CTGCAGTCAG ATCATGAGTT ATTGAGTCTA TTGTTTTTTT AAGCCTGTTT 480
CAGATGATCC ATCATCAGTA ACAACATACA CGGTGTAGTC CCAAATCCAT CATATGCACC 540
TTCTTTTCTT CAATTTGGTC TTGTTTTTTT TTTTTCATGA TGTCATTGAA TTATTCAAGA 600
AGTCACTTCG AGCATAATGA TTTTTCAAAA TCCACCTTTG TTCAAGCACT ACCACGTCTT 660
TTCATCTAGC CCACAACCGT GGTGGAGGAT CTAGAATTTT CATGAAAGGA TTCAAAATTT 720
ACAAACATAT ATATACACTA TACACTATGA ATCCACTAAT ACTAGATGGT GCACCTGTGC 780
CCCCACTCAT GTGAAAGCCT ATTCTCAATT TTTTATTTTC CACAACTTAA ATACAGACCG 840
CACAACTCCC GTGTCTTGTG TGCTCGTCGC TCAGCATGCA AGTCGAGAAA AGAAAGACCA 900
AAACAATGAA AACTTTACGA AAAATCAAAA AGTTGAAGGA CTTTAACGTC GAGATCTCTC 960
GTAGAAAACC TCTTTTGTAA GGTTGCATAC AATACTTTTT TTTCAGACTT TACTTATGGT 1020
ATTATACTGA ATATGTTATT GCTGTTATAG TAGTTGAGTG ACGTTTGAGG GAATTTCTAG 1080
TCCGTTAATC TTGTACTCAG TGTGTCTACT TTTCAAAAAA GTCAGTTTTT CAGTCTCTAA 1140
AACACATTTA AATAAGAGTT TCTTTGCCCA TCTTTTGTTC CTCATCCTAG GCTTGGAGTC 1200
AACACAACAC AACAACA 1217

(2) INFORMATION FOR SEQ ID NO : 2 :

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1114 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 2 : 



TTTGAAATTT ATGTATATAT CTGTAGCATT AGAAACTATA AGAGTTGTTA GCTTCACTTG 60
TCTTATTGTT GTGCTCAAAG CAACTTCATC ATACAGTATG GTTTTTATAT GCTCTTCCAT 120
TATCACCGAA CCTTATGATT ATGTGTACGA GCTTATAATA TTACTGATGG TGATTCAGTA 180
TTATGATTAT GTCCTCCATT AATTATTCTG TTTCATACAA GTCGTGTAAT TTGCTGTTTG 240
TGATTGTACG ATAAATTGAT TCAACCTTCT GCGGTGTTGG TTGAAGTTCA AGTAAATTAG 300
CTTTATTTAT CATAGTAGCA TTTGATTATT GATGCTCTGT AGCTAATGAT AAGCCATTGA 360
AGGGAAGCAG AAATGGTAAA GCTTTCTAAA ATGAATCTAC GAATGGATGA TAAAGTTAAT 420
GAATATTGTT GATACTTCTG CAATCAGATT ATGAGTTACT GAGTCTACTG TTTTTTAAGC 480
CTGTTTCAGA TGATCGATCA TCAACAACAA CATATTCAGT GTAGTAGACA TGATCGATCA 540
CTTTCTAATT TTCGATTATG CACCCTCTTT TCTCCAATTT GGTCGTCTTC TTTTTTTCAT 600
GATGTCACTG AATTATTCTC TGGTCGTCCC CACCATTCAG GAAGTCACTT CGAGCATAAT 660
GTGAAAACAT CCACATTTTT CAAATCCAGC AGAATTTTCA TCAAACGGGG TTCAACATTT 720
ACTACATGTA TACACTCTGA AGTCTGAATC CACTAATTCT AGATGGTGCA TCTGTGCCCC 780
CACACTTGTG AAAGCTTATT CTCAATTTTT TATTTTCCAA CAACTTGAAT TCAGACCACA 840
CAACTCCCGT GTCTTGTACG GTCAGCATCT GAGTGGAGAA CTCAATTAAG TGACTTTAAC 900
GTCGAGTTCT ATAGTAAACA ACCCCTATAT CTTTTTTCAA GCATGTTAAG ATTGCGAACA 960
CACTGAAATT TCCAGGTCGT TAATCTTGTA CCCAGTGTGT GTACTTTTAA AAAAAAAAGT 1020
CAGTTTTTTA GTCTCTAAAA CACATTTAAA TAGAGTTTAT TTGCCATCTT TTGTTCCTCA 1080
TACTAGACTT CGGAGTCAAC ACAACACAAC AACA 1114

(2) INFORMATION FOR SEQ ID NO : 3 :

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6263 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 3 : 



CGACGGCCCG GGCTGGTAAA TGCGGAAGCT TGTTACAGAT TTGAAATTTA TGTATTTATC 60
TATAGCATTA GAAACTATAA GAGTTGTTAG CTTCACTTGG CTTACTGTTG TGCTCAAAGC 120
AACTTCATCA TCATACAGTA TGGTTTTGAT ATGCTCTTCC ATTATCACTG AGCCTTATGA 180
TTATGTTTTA CGAGCTTATA ATATCACTGA TGGTGATTCA GTATTGTGAT TATGTCCTTC 240
GTTGATTATT CTGTTTCATA CAAGTCGTGT AATTTGCTGT TTGTGACAGT ACGATAGATC 300
GACTCAACCT TCTGAGGTAT TAGTTGAAGT TCATGTAAAT TAGCTTTGTT TATCATAGTA 360
GCATTTGATT ATTGATGCTC TGTAGCTAAT GATAAGCCAT TGGAGGGAAG CAAGCTTTCT 420
AAATGAATCT ACGAATGGAT GATAAAGTTC ATGAATATTT TTGTTACTTC TGCAGTCAGA 480
TCATGAGTTA TTGAGTCTAT TGTTTTTTTA AGCCTGTTTC AGATGATCCA TCATCAGTAA 540
CAACATACAC GGTGTAGTCC CAAATCCATC ATATGCACCT TCTTTTCTTC AATTTGGTCT 600
TGTTTTTTTT TTTTCATGAT GTCATTGAAT TATTCAAGAA GTCACTTCGA GCATAATGAT 660
TTTTCAAAAT CCACCTTTGT TCAAGCACTA CCACGTCTTT TCATCTAGCC CACAACCGTG 720
GTGGAGGATC TAGAATTTTC ATGAAAGGAT TCAAAATTTA CAAACATATA TATACACTAT 780
ACACTATGAA TCCACTAATA CTAGATGGTG CACCTGTGCC CCCACTCATG TGAAAGCCTA 840
TTCTCAATTT TTTATTTTCC ACAACTTAAA TACAGACCGC ACAACTCCCG TGTCTTGTGT 900
GCTCGTCGCT CAGCATGCAA GTCGAGAAAA GAAAGACCAA AACAATGAAA ACTTTACGAA 960
AAATCAAAAA GTTGAAGGAC TTTAACGTCG AGATCTCTCG TAGAAAACCT CTTTTGTAAG 1020
GTTGCATACA ATACTTTTTT TTCAGACTTT ACTTATGGTA TTATACTGAA TATGTTATTG 1080
CTGTTATAGT AGTTGAGTGA CGTTTGAGGG AATTTCTAGT CCGTTAATCT TGTACTCAGT 1140
GTGTCTACTT TTCAAAAAAG TCAGTTTTTC AGTCTCTAAA ACACATTTAA ATAAGAGTTT 1200
CTTTGCCCAT CTTTTGTTCC TCATCCTAGG GTTGGAGTCA ACACAACACA ACAACAATGA 1260
ATTTCCATTT TTCTGTTTCT TTACTTCTCT CTTTATCTCT TCCTATGTTT GCCTCTTCGA 1320
CGGTGTTATT TCAGGTATCC ATCTCCAAAG AACCTTATTT TTCTCTTAAC TTTTCCTATG 1380
TATATGTATC TCTATGTTTA TGTAGTACTT GCTCAAGTAT ATAAAGAAAA GTTAGTTTCT 1440
CTAGAATCTT TGAATTCATT TGTTAGGGGT TCAATTGGGA TTCGAGTAAT AAGCAAGGCG 1500
GATGGTACAA CTCTCTCATC AACTTAGTTC CGGACTTGGC TAAAGCTGGA GTTACTCATG 1560
TTTGGTTGCC ACCATCATCT CACTCCGTTT CTCCTCAAGG TAATTTTCGG AGTGATTGTG 1620
ACCTAGTAAT CCAATGAAGT CAAAATAACC ACGGAAGATT AGAGTCTAAA TTTTAATGAA 1680
AATAGTTCAG ACAAGTTAAT GACCAACTTA TATATTAGTT CAATCCATAA AATTTGATGT 1740
AGTAGTTACA AAATGGAATT GCTTGAAGGC TTATGCCATG TTTTATGCCA GGTTATATGC 1800
CAGGAAGGTT GTATGACTAG GATGCTTCCA AGTTTGGAAA TCAGCAACAA CTGAAAACTC 1860
TTATTAAGGC TTTAACATGA CCACGGGATC AAATCGGTTG CTGATATAGT GATAAATCAT 1920
AGAACTGCTG ATAACAAAGA TAGCAGGGGA ATATACAGCA TCTTTGAAGG AGGAACATCT 1980
GATGACCGGC TTGATTGGGG TCCATCTTTC ATTTGCAGGA ACGACACACA ATATTCTGAT 2040
GGCACGGGGA ATCCAGACAC GGGTTTGGAC TTTGAACCTG CACCTGATAT CGATCATCTT 2100
AATACGAGAG TGCAGAAAGA GTTATCAGAC TGGATGAACT GGCTGAAATC TGAAATTGGA 2160
TTTGATGGTT GGCGTTTCGA TTTTGTTAGG GGATATGCAC CTTGCATTAC CAAAATTTAT 2220
ATGGGAAACA CGTCCCCGGA TTTTGCTGTT GGTGAATTGT GGAACTCTCT TGCTTATGGC 2280
CAGGACGGGA AACCGGAATA TAACCAGGAC AATCATAGAA ATGAGCTAGT TGGTTGGGTA 2340
AAAAATGCGG GGCGGGCTGT AACAGCTTTT GATTTTACAA CAAAGGGAAT TCTTCAAGCT 2400
GCAGTTCAAG AAGAGTTATG GAGATTGAAG GATCCCAATG GAAAACCTCC TGGGATGATC 2460
GGTGTTTTGC CTCGAAAAGC TGTGACTTTT ATCGATAATC ATGATACTGG ATCGACACAA 2520
AATATGTGGC CTTTCCCTTC AGACAAAGTT ATGCAAGGAT ATGCATACAT TCTTACTCAT 2580
CCAGGAATCC CATCCGTGGT AAAAAAAATA AATAAATTCT TTCTACATAT CTCATTGTTT 2640
TCTATTTTAC AAGAAATTTA TATTCTTTTC CAGGGGATTT GAGAAACTCG GCCTGTGGGA 2700
GTTTGCTCAC ATTGCCAGTC TCGTAATCCA TAAACAAACA CTCAAACTCT GAGTGTGCAC 2760
ATCTAGACAC CTCAACTCGT TTTTCACCGT GTTAATTGAA CACTTCAACT TACAAAATGA 2820
TCGTGTAGCA CCTCCAAAAA TTATGTGTCA CAATTAGCCA CGTGCGAGAT ACACGAAAAT 2880
GAGTTGGAGT AGTTAGTTGC CAAATAAAAC CAAGCTGAGG TGTCTAAATG TGCACNCTCA 2940
AAGTNGGATG TTTACTTGGC AGCTGAGGCC GAGGCCATGT TTGANTGTTA TGCTTATAGG 3000
ATATGACACA TTTGTTTCCG ATTAGCTGAG GANTTGATTA AATCCTNGTT TTNGTTNGCA 3060
GTTTNATNAC CATTNCTTTG ATNGGGGCTN CNAGGATGGA ATTNCAGCAC TAANCTCTAT 3120
TAGGAAAAGG AATAGGATTT GTGCANCAAG CAATGTGCAA ATAATGGCTC CTGATTCTGA 3180
ATCTTTATAT ANCAATGGAT CATCACAAAA TCATTGTCAA GATTGGACCA AAACTTGATC 3240
TTGGAAATCT TATTCCACCT AATTATGAGG TGGCAACTTC TGGACAAGAC TATGCTGTAT 3300
GGGAGCAAAA GGCATAATCA TATTGTACCA CACTAAAAGG GACCATGGCC ACAATGGTTC 3360
TCATTAGTGT TAATGTTATA TGATTGAAAA TGTAATTTAT ATTGACATAA TGAAGGCCAA 3420
AAATTCAAGA AATTATAAAC AATTCAATAG TCCTTGCTCA ATTCACAATT ACATTATGAC 3480
TTCTCTATTG CAAACTAGTT TGGGTCCACA TTATTGTCTC CTAAAATTTT ACAACATTTC 3540
TTAAGGGAAC TTAATTAGTT ACAGTGAACA TATGTTGAAA TTACCCTTTA TCCCCTTACA 3600
ATTGATTTAA TAAATATTTC CCCTATCCCT TTGGTAGTTG GTTAGAGTTA TAAGTAACGT 3660
AGAGATTAGT TATAAGAGAA TTTATGTATT ATTATGCAGA TGTTTAGTTA TATCGATTTT 3720
AGTTATTTAT ATGTTGATTA TTTCACCTTC AATAATGCAT ATAAAGATGG TAAATGATTG 3780
GATTGATCGA ATTCGAATGA GTTTGAATAT GAACTAATCT TCAAATTTAA TATAAATTTT 3840
TTTTGTCAAC ATCTATAGCC AAACGGCTCC AAAACAATAA ATAATTTACA TTTATTGTAG 3900
TATTTTATTT AAAATGGGAT NTTCCTCATC CCACTTGTAC CAGTTGAAAC CCTAATAATA 3960
AGCCAATCCA ACCGTCAAAA TTACAAATTT TGAAAATTGC GCTCCTCACA GTTCTCCCCT 4020
ATTCAGATTT GATTCATTCT CTTCATTTTT TGTTTTCACA TTTTACCTCT AAATCAACTC 4080
GAGTCCCTTT GTTCAAATGG GTGCTAATCA CAGCCGTGAA GATCTGGAGC TTTCTGATTC 4140
CGAGTCTGAA TCCGAATATG GGTCCGAGTC TCGAACAAGG GAGGAAGAGG AAGACGAAGA 4200
TAACTACTCA GATGCTAAAA CGACGCCGTC TTCCACTGAT CGGAAACAGA GCAAAACCCC 4260
GTCTTCTTTG GATGATGTTG AAGCAAAGCT GAAAGCTTTA AAGCTTAAGT ATGGTACTCC 4320
TCATGCTAAA ACCCCCACAG CGAAAAACGC TGTTAAACTT TACCTTCATG TTGGTGGGAA 4380
CACTGCGAAT TCCAAATGGG TAGTTTCTGA TAAGGTGACA GCTTATTCGT TTGTTAAATC 4440
GGGTAGTGAG GATGGATCGG ATGATGATGA AAATGAAGAA ACTGAGGAGA ATGCTTGGTG 4500
GGTTTTGAAA ATTGGGTCGA AGGTTCGGGC TAAGATTGAT GAGAATTTGC AGCTCAAGGC 4560
ATTTAAGGAG CAGAAAAGGG TGGATTTTGT GGCGAATGGG GTTTGGGCTG TGAGATTCTT 4620
TGGGGAGGAA GAGTATAAGG CGTTCATTGA CTTATATCAG AGCTGTTTGT TTGAGAATAC 4680
TTATGGGTTT GAGGCAAATG ATGAGAATAG AGTTAAGGTG TATGGTAAAG ACTTTATGGG 4740
GTGGGCAAAT CCAGAAGCTG CGGATGATTC AATGTGGGAG GATGCTGGGG ATAGCTTCGC 4800
GAAGAGCCCT GCGTCTGAAA AGAAGACACC TTTGAGGGTT AACCATGATT TGAGGGAGGA 4860
GTTTGAGGAG GCAGCTAAAG GAGGAGCTAT TCAGAGCTTG GCATTAGGTG CGTTGGATAA 4920
TAGTTTTCTT ATAAGTGATT CTGGAATTCA GGTTGTGAGG AACTATACTC ATGGAATAAG 4980
TGGAAAAGGT GTTTGTGTCA ATTTTGATAA GGAAAGGTCT GCTGTACCTA ATTCCACTCC 5040
AAGGAAAGCT CTACTTCTAA GAGCTGAGAC TAATATGCTT CTCATGAGTC CAGTGACTGA 5100
TAGAAAGCCT CACTCTCGGG GATTACATCA GTTTGATATC GAGACTGGGA AGGTTGTTAG 5160
CGAGTGGAAG TTTGAGAAAG ATGGAACTGA TATCACGATG AGGGATATCA CTAATGATAG 5220
CAAAGGAGCT CAGATGGATC CTTCGGGGTC TACTTTCTTA GGGCTAGATG ATAACAGATT 5280
?TGTAGGTGG GATATGCGTG ATCGGCATGG GATGGTCCAG AATCTAGTTG ATGAAAGTAC 5340
TCCTGTGCTG AATTGGACTC AAGGACATCA ATTTTCGAGG GGAACTAACT TTCAGTGCTT 5400
TGCTACTACT GGTGATGGAT CAATTGTTGT TGGTTCACTT GATGGCAAGA TTAGATTGTA 5460
CTCAAGCAGT TCCATGAGAC AGGCTAAAAC TGCTTTTCCA GGCCTTGGTT CTCCTATCAC 5520
TCATGTGGAT GTTACCTATG ATGGGAAGTG GATATTGGGG ACAACTGATA CTTACTTGAT 5580
ATTGATATGC ACCTTGTTTA TCGACAAGAA TGGAACTACT AAGACTGGTT TTGCTGGTCG 5640
CATGGGAAAT AAGATTTCCG CTCCAAGATT GTTAAAGCTA AACCCTCTCG ATTCACATAT 5700
GGCTGGAGCT AACAAGTTCC GCAGTGCTCA ATTTTCATGG GTCACCGAGA ATGGGAAGCA 5760
AGAGCGCCAC CTCGTTGCTA CTGTTGGGAA GTTTAGTGTG ATCTGGAATT TTCAACAGGT 5820
GAAGGATGGT TCTCATGAGT GTTACCAGAA TCAGGTTGGG TTGAAGAGCT GCTATTGTTA 5880
CAAGATAGTC CTAAGAGACG ACTCTATTGT AGAAAGTCGT TTCATGCATG ACAAGTACGC 5940
TGTTTCTGAC TCACCTGAAG CACCACTGGC GGTAGCAACC CCCATGAAAG TCAGCTCATT 6000
CAGCATCTCT AGCAGGCGCT TACAAATTTG AACAATCATT CTGTTCATAT ACGCAACTTA 6060
TTAGATTTAT CTGTAGCAGA ATTAGTGTCT CTCACACTAA GTAGCTTGAA AAACTGCACA 6120
TCTGCAAATC ATTTCCAGTT CAATGTATTA CTACTTTAGT TTAAAAACCT TAAAAGGCAH 6180
TCTTCCAAAT TCTAGGTATC CTCACCTGAC ATTATTATTG TTGTAATAGC TAATTGTTGC 6240
TTGCTCTAAA TCCCCGTTCA ATG 6263



(2) INFORMATION FOR SEQ ID NO : 4 : 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 684 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 4 : 



AATCAAAGAG GAATTNAATT CCNCAAAATT TCATCCATAG ATTTTGNGTC 50
TCTGAAAATT AAAGTGACTT TGTAATCTGA AACCTAGAGT CCTCAACCAT 100
ATCATTGACC ATTAAGCCAT ACCCTTAAAT GTAGGGAATT TGAAGTTTTA 150
AAAACCACAC TTTGTTATTT ATTGGCCCAA ATACTCGATA ATCTTTACAT 200
TATTGAAAAT CAACATTCAA AAGGAACGAA CCTTCAATCA CACCATCAAT 250
GTCAACTTTC TTTTATTTTG GATAATCTAA GTTTTTAAAT TGCAGTAAAA 300
TNAAATAAAA CCCTAAACTT CTTCTAGGTT GAGACTTAGT AAATATGAAT 350
TATATAAAGA ATTCATGACA AATGAGACAT AAGAATAGTG CCAGCAAATT 400
ACTTTTTTGA TATCTTATCT GTGATATCGG AATTTTAACT ACCATAAATT 450
TATGAATGAA ATATCACTTA TCTATTAGAG AGGATTTAAT CTCCCTTATA 500
ATGACATTGA TAAAAGCAAG NACAAGTGCT CTTTATTTCT TAATTACAAA 550
TCCTTAAATA GATAAAAGCT ACGAATAACA TAATATCCTT AAATAGATAA 600
AAGCTACGAA TAACATAATA GTATATTACT CCNAATTATT TTGATTTATT 650
TAAAATGACT CCACTAATCC TGATGTGGTC TAGG 684

(2) INFORMATION FOR SEQ ID NO : 5 :

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 662 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 5 : 



GGTCTAGGCC CTGGGTCTAG GAAACAAAAT AACTTATTTG ACTCCTAAAC 50
AATAGCAACA TACAAACCAC TGATATTGTA CAAGTAAAAT TCAATAAAAT 100
TCTAGCTCTC TCAAACACTT TTAAAATTGT TATTTCTGTT TTGTCTGTGT 150
CATATTATGA CCTACACAAC AACAACAACA ACGAATTTAG TGAAACTCTA 200
CAAAGTGGAG CCTGAAGTCG AGAGTTTACG CGGGCCTTAT CACTATCTTT 250
TCGAGATAAA AAAATTATTT TTAAAAGATC ATCGACTTAA ACAAACCAAA 300
CAATAATTAA AAAAATATGA ATTAATAGCA AAGCAGTGTG GACCATATAT 350
ACAAAAATCT ATAACAACAA CAAGGTGCAG AGCATTATTC CAACTAAGAT 400
CGAAGTTGTG ATACTGTCAT AATAAAAATG ACACATATTT TGACAACATA 450
AAAAATAAAT AACCATAAAA TATATCATAG AAAAATGAAT ATATTAGAAC 500
AGCTCACTCC AATATTAAAA GAGAGAAAAA AAATATTTTC CCACCACAAT 550
GCCATAATCC TTGAGCTTAG CTATTTATAA GTAAAAAAAA TGTTTTCTTG 600
GATAAATAGA AAAAGAAATA ATAATTAAAC ATAACCAATC ACTTCACAAA 650
TAAGAGTGTA TT 662



(2) INFORMATION FOR SEQ ID NO : 6 :

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 63 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 6 : 

ATTTATTTTT AGGAAAAATT ATCTAAATAC ACATCTTATT TTACCATATA CTCTAAAAAT 60
TCC 63

(2) INFORMATION FOR SEQ ID NO : 7 :

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 63 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS : single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO : 7 : 

AATTATATTT AGGAAAAATT ACATAAATAC ACAACTTAAT ATATTATATT CTCTAAAATT 60
TCC 63

DATED this 25th day of September 1998 

THE UNIVERSITY OF QUEENSLAND 

By DAVIES COLLISON CAVE
Patent Attorneys for the Applicants 

FIGURE 5 (i)

9 81 TTTGAAATTTATGTATATATCTGTAGCATTAGAAACTATAAGAGTTGTTA 

MIIMIIIIIIMII I I 1 I I | 1 I I I 1 I I 1 I i I t 1 I I 1 1 1 I I i 1 1 1 1 I 

4 0 TTTGAAATTTATGTATTTATCTATAGCATTAGAAACTATAAGAGTTGTTA 



1030 Potato 
89 Tomato 



1031 GCTTCACTTGTCTTATTGTTGTGCTCAAAGCAACT . . . TCATCATACAGT 1077 

Mllllllll I II I I I I I I Ml Ml I M M I M llllllllllll 

GCTTCACTTGGCTTACTGTTGTGCTCAAAGCAACTTCATCATCATACAGT 



90 



139 



1078 ATGGTTTTTATATGCTCTTCC ATTATCACCGAACCTTATGATTATG . TGT 112 6 

IMIIIII I I I I I I I I I I 1 I I I I I I I I I II I I I I M 1 I I I I I I I I 
140 ATGGTTTTGATATGCTCTTCCATTATCACTGAGCCTTATGATTATGTTTT 189 

1127 ACGAGCTTATAATATTACTGATGGTGATTCAGTATTATGATTATGTCCTC 117 6 

I II Mill II Mill 1 1 1 1 1 1 1 1 1 1 M M II II II M IN I I Mill 

190 ACGAGCTTATAATATCACTGATGGTGATTCAGTATTGTGATTATGTCCTT 239 

1177 C ATTAATTATTCTGTTTCATACAAGTCGTGTAATTTGCTGTTTGTGATTG 12 2 6 

I II I II I I M II I I M II I I II I I I I I M M II I II I I M M M I I 
240 CGTTGATTATTCTGTTTCATACAAGTCGTGTAATTTGCTGTTTGTGACAG 2 89 

1227 TACGATAAATTGATTCAACCTTCTGCGGTGTTGGTTGAAGTTCAAGTAAA 1276 

MIIMI M M IMMMIMI III II 1 1 M I M II I I Mill 

290 TACGATAGATCGACTCAACCTTCTGAGGTATTAGTTGAAGTTCATGTAAA 3 39 
1277 TTAGCTTTATTTATCATAGTAGCATTTGATTATTGATGCTCTGTAGCTAA 132 6 

IMIIIII I M II I 1 1 1 1 1 1 1 1 1 1 II 1 1 II 1 1 M I I II M 1 1 I II II I I 

3 40 TTAGCTTTGTTTATCATAGTAGCATTTGATTATTGATGCTCTGTAGCTAA 3 89 
1327 TGATAAGCCATTGAAGGGAAGCAGAAATGGTAAAGCTTTCTAAAATGAAT 137 6 

1 1 1 1 1 M I M M I IMIIIII I 1 1 1 1 1 i I I IMIIIII 

3 90 TGATAAGCCATTGGAGGGAAGC AAGCTTTCT . AAATGAAT 42 8 

1377 CTACGAATGGATGATAAAGTTAATGAATATTGTTGATACTTCTGCAATCA 1426 

M II II 1 1 1 II 1 1 II 1 1 1 1 1 1 II MINI I Ml Mill II Ml III AnQ 

429 CTACGAATGGATGATAAAGTTCATGAATATTTTTGTTACTTCTGCAGTCA 47 8 
1427 GATTATGAGTTACTGAGTCTACTG . TTTTTTAAGCCTGTTTC AGATGATC 147 5 

Ml I II I M II Ml MM I M MIMM MIMMMMMIMM 

479 GATC ATGAGTTATTGAGTCT ATTGTTTTTTTAAGCCTGTTTC AGATGATC 52 8 
1476 GATCATCAACAACAACATATTCAGTGTAGTAGACATGATCGATCACTTTC 1525 

MIIMI II Ml III I II MUM I III MM 

52 9 C ATC ATC AGTAAC AAC ATACACGGTGTAGT . . CCC AAATCC ATCA 571 

1526 TAATTTTCGATTATGCACCCTCTTTTCTCCAATTTGGTC . . GTCTTCTTT 157 3 

1 1 1 II II I MIIIMI 1 1 II I II I M II II Ml 

572 TATGCACCTTCTTTTCTTCAATTTGGTCTTGTTTTTTTT 610 

1574 TTTTCATGATGTCACTGAATTATTCTCTGGTCGTCCCCACCATTCAGGAA 162 3 

M M M 1 1 II 1 1 1 1 II II I I Mill Ml 

611 TTTTCATGATGTCATTGAATT ATTCAAGAA 640 

1624 GTCACTTCGAGCATAATG . . . TGAAAACATCCACATTT . TTCAA 16 63 

Mil I I I I M I I I I M Mllll IN Mm 

641 GTCACTTCGAGCATAATGATTTTTCAAAATCCACCTTTGTTCAAGCACTA 690 

UQ406 insertion



FIGURE 5 (ii) 

16 64 ATCCAGC AGAATTTTC 167 9 

M I I II I Ml I II I I 

691 CCACGTCTTTTCATCTAGCCCACAACCGTGGTGGAGGATCTAGAATTTTC 740 

1680 ATCAAACGGGGTTCAACATTTAC . . . TACATGTATACACTCTGAAGTCTG 1726 

M IN II Mill MUM I I I MIMIII I I I II 

741 ATGAAA..GGATTCAAAATTTACAAACATATATATACACTATACACTATG 788

1727 AATCCACTAATTCTAGATGGTGCATCTGTGCCCCCACACTTGTGAAAGCT 1776 

Ml MIMIII 11 Ill IMMMIIMI I MINIMI 

789 AATCCACTAATACTAGATGGTGCACCTGTGCCCCCACTCATGTGAAAGCC 838
1777 TATTCTCAATTTTTTATTTTCCAACAACTTGAATTCAGACCACACAACTC 1826 

Ml II IMMMIIMI Mill I Ml II I III III III III II I II 

839 TATTCTCAATTTTTTATTTTCC.ACAACTTAAATACAGACCGCACAACTC 887
1827 CCGTGTCTTGT ACGGTCAGCATCTGAGTGGAGAACTCAA .... 1865 

I II MIMIII li Ml: M Ml Mill II 

888 CCGTGTCTTGTGTGCTCGTCGCTCAGCATGCAAGTCGAGAAAAGAAAGAC 937 
1866 TTAAGTGACTTTAACG 1881

Ill Ml I II I III

938 CAAAACAATGAAAACTTTACGAAAAATCAAAAAGTTGAAGGACTTTAACG 987
1882 TCGAGTTCTATAGTAAACAACCCCT ATATCTT 1913 

MIM III I III I MM M Ml M 

988 TCGAGATCTCTCGTAGAAAACCTCTTTTGTAAGGTTGCATACAATACTTT 1037 

1914 TTTTCAAGCATGTTAAGATTGCGAACACACTGA 1946 

MM II I III M I I Mill 

1038 TTTTTCAG.ACTTTACTTATGGTATTATACTGAATATGTTATTGCTGTTA 1086

1947 AATTTCCAGGTCGTTAATCTTGTACC 1972

Mil! I II 1 1 M 1 1 11 1 1 1 1 M 

1087 TAGTAGTTGAGTGACGTTTGAGGGAATTTCTAGTCCGTTAATCTTGTACT 1136 
1973 CAGTGTGTGTACTTTTAAAAAAAAAAGTCAGTTTTTTAGTCTCTAAAACA 2022 

MIMIII lllllll IIIIMIIIIMIMI 1 1 1 1 1 1 1 1 1 1 1 1 1 iio , 

1137 CAGTGTGTCTACTTTT . . . CAAAAAAGTCAGTTTTTCAGTCTCTAAAACA 1183 
2023 CATTTAAAT . AGAGTTTATTTG . CCATCTTTTGTTCCTCATACTAGACTT 2070 

MINIMI 1 1 1 1 i M MM 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 MM III 

1184 CATTTAAATAAGAGTTTCTTTGCCCATCTTTTGTTCCTCATCCTAGGCTT 1233

2071 CGGAGTCAACACAACACAACAACA 2094 

M I I II II I II M M I I II I II I 
1234 . GGAGTCAACACAACACAACAACA 1256 
FIGURE 6 (i)



[Position labels for the sequence lines: 51 to 3101, in steps of 50]



o«eaocG» cttc-rsGT^ ^SSSSS. £££££ ££££££ 

tcta^wtc 2£i5£?S «Sa?Lta tcgttttoat 

SgE SSK SSSS 3SSS S5EESS 
JS ™ EE1 SS55SS SS SSSS 

GACTCAACCT TC TG AGG i AT |^f^^" ty-TaGCTAAT GATAAGCCAT 

?S^Htc CACAACCGTG GTGGAGGATC TAGAATTTTC ^AAAGGAT 
TCAAAATTTA CAAACATATA TATACACTAT ACACTATGAA TCCACTAATA

£335 2SSSE sssss sssss 

S^tcgct cagcatgcaa gtcsagaaaa gaaagaccaa aacaatgaaa 
SSS £S£aaaaa cttgaaogac ™corcc ^ctc*cg 

^™ sis ssssss sssss ^| ? -tga 

JSSSSS aItSSagt ccgttaatct tgtactcag? gtotctacto 
t?caSaS ^SSrrc agtctctaaa acacatttaa ataagaottt 

CTTTGCCCAT CTTTTGTTCC TCATCCTAGG CTTGGAGTCA ACACAACACA

JSaISSa aStcSttt ttctgtttct ttacttctc? ctttatctct 

TCCTATGTTT gStCTTCGA CGGT6TTATT TCAOCTATCC A7CTCCAAAG 
I^rniii^r. ^T, rT rT"AAC TTTTCCTATG TATATGTATC TCTATGTTTA 
AACCTTATTT TTCTCT^AAt i j rZZ.i?** rWafl^TTCT CTAGAATCTT 
TYVrtCTACT^ GCTCAAGTAT ATAAAGAAAA GTTAG . TIT- 1 
JSSttcaS TGTTAGGGGT TCAATTGGGA ttcgagtaat aagcaaogcg 
^^JZlr-ll rT^TCATC AACTTAGTTC CGGACTTGGC TAAACCTGOA 

Si Si ssss sssss SKS 

T^riAiW ifiat5r"eTAAA TTTTAATGAA AATAGTTCAG ACAAGTTAAT 

££££££ t^SSg^t SatJcataa AATTTGATGT agtagttaca 
SaSSSw SttSaggc ttatgccatg ttttatgcca w*2"mkk 

^^^I^r rtTATCACTAG GATGCTTCCA AGTTTGGAAA TCAGCAACAA 

^55i^A?^ SatoSaS? tctaacatoa ccacgggatc aaatcgg?tg 

K^BSSSMSS 

SSiSS ss sssss§ =s SJSS 

GGTGAA^TGT GGAACTCTCT TGCTTATGGC CAGGACG3GA AACCOGAA'rA 

SsS SSSSS SSS5 SSSSS 

Si S sssssss ssssss sssss 
ssss ss s»»»« sssss 

MSCAAOSM ATCCATACAT TCTTACTCAT CCAOOAATCC a"J=?I?> 

Hssssk ssss sssss sssss ggsg 
FIGURE 6 (ii)

3151 CAATGTGCAA ATAATGGC7C CTGATTCTGA ATCTTTATA7 ANCAATGGAT

3201 C-A«CACAAAA TCATT3TCAA GATTGCACCA AAACTTGATC TT3GAAATC?

3251 TATTCCACCT AATTATGAGG TGCCAACTTC TGGACAACAC TATCCTGTAT 

3301 GGGAGCAAAA GGCATAATCA TATTGTACCA CACTAAAAOG GACCATGGCC 

3351 ACAATGGTTC TCATTAGTGT TAATGT7ATA TGATTGAAAA TGTAATTTAT 

3401 ATTGACATAA TGAAGGCCAA AAATTCAAGA AATTATAAAC AATTCAATAG 

3451 TCCTTGCTCA ATTCACAATT ACATTATGAC TTCTCTATTG CAAACTAGTT 

3501 TGGGTCCACA TTATTGTCTC CTAAAATTIT ACAACATTTC TTAAGGGAAC

3551 TTAATTAGTT ACAGTGAACA TATGTTGAAA TTACCCTTTA TCCCCTTACA 

3601 ATTGATTTAA TAAATATTTC CCCTATCCCT TTOOTAGTTG GTTAGAGTTA 

3651 TAAGTAACGT AGAGATTAGT TATAAGAGAA TTTATGTATT ArTATGCAGA

3701 TGTTTAGTTA TATCGATTTT AGTTATTTAT ATGTTGATTA CTTCACCTTC 

3751 AATAATGCAT ATAAAGATGG TAAATGATTG GATTGATCGA ATTCGAATGA 

3801 GTTTGAATAT GAACTAATCT TCAAATTTAA TATAAATTT? TTTT&TCAAC 

3851 ATCTATAGCC AAACGGCTCC AAAACAATAA ATAATTTACA TTTATTGTAG 

3901 TATTTTATTT AAAATGGGAT NTTCCTCATC CCACTTOTAC CAGTTGAAAC 

3951 CCTAATAATA AGCCAATCCA ACCGTCAAAA TTACAAATTT TGAAAAWGC 

4001 GCTCCTCACA GTTCTCCCCT ATTCAGATTT GATTCATTCT CTTCATTTT? 

4051 TGTTTTCACA TTTTACCTCT AAATCAACAA AATTCCCTTT STTCAAAItafi Dem ATG 

Aim GTGCT A W rAnficGTC rM ^'^ggaGC TTTCTGATTC HMgTCTfiAA 

}%i Wtgaatatg GGTce c^ ryry tosaacaagg nAOGAAGAGG MfiACfiMffA 
rttc-rinvrA gatc-t™**! reAC fiCf^ i"rcrArmAT PfflrAMCAfffl 

GCAAAACCCC GTCTT C *™ GATGATGTTG AAGCAAAGCT GMAKTTTA 

fis^n^ i^w^y- ^a^aa acecccMw 

4**1 TrtTTAAACTT rACC ™T*,™ CACTGCGAAT TCCMATGGG 

ami raonrrff" TAAGG ^ ArA GrrTATTCGT TTGT7MMK! flggmgngftg 
iiii SreeiTrtaa atsato * ™* AAATflAAGflfl AC^AmAGA ATflGTIGGTG 
dVot SOTTCryiAAA ATTSSO T^ a AnGTTtGGGC TAAGATTGAT (WMA . T7T6C 
AGCTC A H** ATVTAAGGAG r*GM*MQG TQGATTTTQT ggCgAA^iS 
am ama ^ r™ raifijiw™* TgaacAGflft* GflfiTATAAQr? CGT7CA.TTGA 

A**l rTTATATTAS ACCTOT^T TTflAfiAATAC TTA TTrfrOTTT (WrflCAAAW 

47m jTV^^AATAg AGrra^^^ TA-rreTAAAO ACTTTATOgG— GJPfyTfffilflMT 
47m ccAff A^'3 n y: ^ ■t'^'wg Afl TTrT r TfT l f** ( * <«refrcgsQ flTAfiCTTCOC 
jam GAAGAcrrcr gcgtctt .* ** *r.A*GACArr rrrGAGGGTT MiGGMORn 

ABSl •KSAGGGAGnA GTTTC *MAG GCAGrTAAAG OAflSAfiTTAT TCAQAGCTTS 

dini GCAWtrsyrc pgttgga tm ™r*w?cw ataagtgatt CTGGAA7TCA 

AOS, GGTTGTGAGG AACTA^W ATGGAATAAG TGfrAAAAIrfiT gnTOTgCCfl 

r^, mmmwtt ficmfufrim *"™>ctt 

*rt<i rTACTTTTAA MflMWAflfl ff taawhotw PTTATSAGrC CAGTOAGCfiA 

l%\ V^aagt^ iSi^Sg rlrrarflTTA fiTTTOAWTT M™™** 
ci ci aS-rwrfAo asAara o *™ ^t^caaag a tggaactCA TArCACffATg 

AflQGAiC A ^ CTJATfiATAl? CAAAflfiAgCT f fl^^A^ ^nrffj»gTC 
h" TArWTy-7vPA BOBCTA^ nva iwa^flAW' CTGTASGTGG ffATflTfiCGTS 
ArrfiflgAov^ ^Tr^A fT ..^iflww AinAM<vr*r rCCTGTGCTG 
lit "ir^rTr AACCACATffl ATTTfyflAgQ ^AACTAACT TZgAggaai 
TCCTA^r^ amaAT Qm T r"^ 1 ^ TGG^rCAcrT' ffATCgCAflgfl 

IZrT^T n ^»^r a^ctaW Tg^Trreca 

ggni ggccwggtt c^T ATCfir t^atc^a'p nrraerroTB ATflffflMgrfl 
^ATOMras A«AC«a M rTTACTr ffAT ^rrpftTftrac flcc ggzm 
^7 TraACA A ^A .Tv^AArrAcr AaoarTUfflT TTnrmw™ (TATPffliAMT 
cfl?; 1^7 ^ -^ nrY-»«.w ffWAHAflTTA AArrrrCTCff ATTCAC ASag 
g7 r)7 rtr^re^ryvp AArAA flT ^r mnvGCTCA ATTTTCATG(r CrTCflCCgflga 
iwi A^a a nff™ afl*fln«?rAr r^srrogTA rTftTTOgfiM ffTTT^grSTC 
ATV77YrpA ATT rrc ^ arA^yP fiAAf^ A rffffT 'nr^TrtAaT GTTACCAGAA 

c«i t"a;^?^ t^aagaqct rrrflnr^' ctaagagacg 
^atX ^aaa^«; ^7^™ ACA^grflrnr TgnCTatc 

^i T^gCT g AA« rArrArrflcc r^TA^AAO? CCr«TTM<»K? ™CX ag 
fpni %7rMr^r agcag g tv^ TArAAArrrw AArAATCATT CTOTTCAMf 
fpni AgQ C Aar»PTA TTAfiATTTAT OTSTAfyAflA ATrflffTRTCT CTTflCACZflfl 
FIGURE 6 (iii)

6101 ^nrnWAA ^^tt^mca Tr T^*^ ATTTCCAGTT CAAT9TVSXA

Si?™T gjg^jg idmg ggg^ g 

rrr*rrrG*o at t *™t™ TTr-TAATAdV TMTTfrTTTr^ Ttt><TC r 
^3*7 Tr^OCTTCA ATC 
FIGURE 7